AudioRadar: A Metaphorical Visualization for the Navigation of Large Music Collections

Authors

  • Otmar Hilliges
  • Philipp Holzer
  • René Klüber
  • Andreas Butz
Abstract

Collections of electronic music are mostly organized in playlists based on artist names and song titles. Music genres are inherently ambiguous and, to make matters worse, assigned manually by a diverse user community. People tend to organize music by its similarity to other music and by its emotional qualities. Taking this into account, we have designed a music player which derives a set of criteria from the actual music data and then provides a coherent visual metaphor for similarity-based navigation of the music collection.

1 About Songs, Playlists and Genres

In the January 27, 2006, edition of People magazine, reviewer Chuck Arnold likens the new Australian duo The Veronicas to "such pop-rock princesses as Avril Lavigne, Hilary Duff and Ashlee Simpson." He goes on to state that "Everything I'm Not, a cut off their album The Secret Life Of..., was produced by frequent Britney Spears collaborator Max Martin and has a chorus that echoes Kelly Clarkson's Behind These Hazel Eyes."

When we talk about music and try to explain its properties to others, we frequently use constructs describing similarity. One reason for this is that it is much easier to imagine how a piece of music might sound if we can relate it to a song we already know. This also makes it easier to decide whether or not we might like a song or record that is being discussed. Describing music by similarity to other music seems to work quite well and is widely used in the music press. However, state-of-the-art digital music players like iTunes [2], Winamp [18] or XMMS [28] do not take this into account. All of these players organize digital music libraries using meta information about the songs or albums (e.g. artist, title) and/or a limited set of predefined genres. This works quite well as long as we know most of the songs in a library and into which genre a song or artist fits. But this approach has several implicit problems:

1. Genres are not expressive enough to cover the breadth of an artist's repertoire. Almost no artist would agree that their entire work can be classified into one single category.
2. Genres are too imprecise to guide users through the vast amount of available music (e.g. the iTunes music store categorizes such diverse artists as punk rockers Anti-Flag and singer/songwriter James Blunt under the genre "Rock").
3. Genres are of little help to users who want to explore and discover new and unknown music libraries, especially if the artist name is unknown or hard to classify into one of the existing categories.
4. Genres generally do not match our moods very well; a song from the category "Rock" could be a slow, calm ballad or a fast, rough and loud song.

A major reason for these problems is the imprecise nature of the genre concept itself. Attempts to classify music by genre often fail because of ambiguities, subjective judgment and marketing interests. In general, there is a conflict between the broad variety of music (and music properties) and the relatively rigid and error-prone classification system. The fact that meta information is stored in ID3 tags [17], which are created and applied by humans, adds to this problem. In practice, most ID3 tags are obtained from online databases like Gracenote CDDB or FreeDB, which are created and maintained by a large community of volunteers. This information is very useful in many scenarios (e.g. displaying song title, album and duration), but there is no quality assurance and, in fact, genre information is often incorrect. For music classification it is a problem that the information is assigned to the music rather than derived from it.

In response to the problems above, we propose a radically different approach to organizing, browsing and listening to digital music, which is based on two main steps:

1. Instead of relying on meta information, we analyze the music itself, derive a number of meaningful descriptive features from it, and organize the music library by the similarity between songs according to these features.
2. Using this analysis, we create a graphical representation of all songs in the library based on similarity. Our visualization employs a radar metaphor as a coherent conceptual model: similar songs are grouped close together, and the user navigates a musical seascape.

This allows users to surf through their music library (or a music store) guided by similarity instead of scrolling through endless lists. Our prototype is a new digital music player called AudioRadar. Currently the player has two main functionalities: library browsing and a playlist editor. Both parts of the application are centered around the properties of the actual music and their similarity. The browser resembles a ship's radar: the current song is the centroid, and similar songs are grouped around it. A user can thus immediately see that nearby songs are similar to the active song, but a bit faster or slower, rougher or calmer, and so on. The distance from the centroid (along the corresponding dimension's axis) shows how different the songs are. In the playlist editor, users can choose from several dimensions (e.g. speed, rhythm, tone) and specify a range of values they want in their playlist. Users can thus effectively create playlists that suit their mood, for example a playlist containing only songs that are relatively slow and calm. A sketch of both ideas is given below.
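The paper describes these two functionalities only conceptually. The following minimal sketch shows one way they could be realized once every song has been reduced to a feature vector; the Song type, the dimension indices and the [0, 1] value range are our assumptions, not part of the original system.

```python
from dataclasses import dataclass
from math import dist  # Euclidean distance (Python 3.8+)

@dataclass(frozen=True)
class Song:
    title: str
    features: tuple[float, ...]  # one value per dimension, assumed in [0, 1]

def most_similar(centroid: Song, library: list[Song], n: int = 10) -> list[Song]:
    """Browsing: rank the library by feature-space distance to the active song."""
    others = [s for s in library if s is not centroid]
    return sorted(others, key=lambda s: dist(s.features, centroid.features))[:n]

def mood_playlist(library: list[Song], ranges: dict[int, tuple[float, float]]) -> list[Song]:
    """Playlist editor: keep songs whose chosen dimensions lie in user-given ranges."""
    return [
        s for s in library
        if all(lo <= s.features[d] <= hi for d, (lo, hi) in ranges.items())
    ]

# Hypothetical usage: a "slow and calm" playlist, assuming dimension 0 encodes
# speed and dimension 2 encodes turbulence.
# playlist = mood_playlist(library, {0: (0.0, 0.3), 2: (0.0, 0.4)})
```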
2 Related Work and Contribution

Two different aspects need to be addressed in our discussion of work related to the AudioRadar system: the extraction of musical features and the visualization of the music collection. Our claim that the automatic extraction of features from musical data can improve music browsing is backed up by a number of projects in the music information retrieval (MIR) community; an overview of MIR systems is given by Typke et al. [22]. Classification mechanisms range from metadata-based via collaborative-filtering to purely feature-based approaches. McEnnis et al. [15] present a library for feature extraction from musical data and discuss other similar work. Liu et al. [13] propose a method for mood detection from low-level features, and Li and Sleep [12] as well as Brecheisen et al. [7] even derive genres from low-level features. The Music Genome Project [27] relies on features entered by human listeners to classify music, but uses a collaborative filtering approach to create coherent playlists. Uitdenbogerd and van Schyndel [23] discuss collaborative filtering approaches for music information retrieval and how they are influenced by different factors. Schedl et al. [20] propose to use the co-occurrence of artists on Web pages as a measure of similarity and derive a degree of prototypicality from the number of occurrences. Berenzweig et al. [5] give an overview of similarity measures and discuss how subjective they are, and Ellis et al. [9] question whether there even is a ground truth with respect to musical similarity, but try to provide a number of viable approximations. We do not claim to make a technical contribution to the actual analysis of music; rather, we use known methods for extracting the features used in our visualization.

The second aspect of our work is the actual visualization of the music collection. The information visualization community has come up with a number of ways to present big data sets interactively. Classical examples are starfield displays and scatter plots. The FilmFinder [1] applies the starfield concept to a movie database with several thousand entries. The motivation behind this work is exactly the same as ours, namely to browse and navigate a complex and high-dimensional space according to some meaningful criteria. The MusicVis system [6] uses a scatterplot-like display. It arranges songs as grey, green or blue blobs in a plane and determines proximity between them by their co-occurrence in playlists. MusicVis can also create playlists from its database, which represent coherent subsets of the music collection with familiar song sequences.

Fig. 1. Left: a real radar screen showing one's own ship as the centroid; green dots symbolize other vessels. Right: the AudioRadar application in browsing mode. The active song is the center point; similar songs are grouped around it.

The Liveplasma Web site [25] presents a graphical interface to a musician database and a movie database. Starting from a search term, it presents the closest match in the center of a zoomable display and groups similar artists or movies around it using a spring-model-based layout. By clicking on another artist, that one becomes the new centroid, and the similar neighbors are dynamically rearranged. Torrens et al. [21] describe visualizations of a personal music collection in the shape of a disc, a rectangle or a tree map. Vignoli et al. [24, 26] present the artist map, a space-conserving visualization of music collections for use on a PDA screen. Our work goes beyond these existing approaches in providing a coherent mental model, the radar metaphor, for the actual visualization as well as for the navigation of the music collection. Pampalk et al. [19] propose a visualization of feature-based clusters of music as "islands of music", but do not provide a means of navigating this seascape. Our main contribution over these existing visualizations of music collections is the provision of a metaphor from the physical world. Most of us have an intuitive understanding of what a radar is and how it works. We understand the spatial mapping, which tells us where different objects around us are, in particular in which direction and how far away. This spatial mapping carries over directly to our interface: more similar songs are displayed closer to the center, and the direction tells us in which aspect they differ. While this is also the case in the FilmFinder, Liveplasma or the islands of music, the radar metaphor conveys the feeling of literally navigating a musical seascape and supports casual meandering and browsing.

3 Navigating the Sea of Sounds

The name AudioRadar refers to the metaphor of a ship's radar, a system used to detect, range and map objects such as aircraft and other ships. In our application we calculate the distances between songs by analyzing the audio stream. We use this information to position songs on a radar-like map where the current song is the centroid (Figure 1).

The center area of the AudioRadar player shows the active song and some controls known from standard music players (play, pause, loudness, progress). Radiating out from that centroid are similar songs positioned along four axes. The direction of a song's offset is determined by its dominant difference from the active song. As shown in Figure 1, this means that "Gorillaz - Feel Good Inc." is faster than "50 Cent - In Da Club". The distance from the center symbolizes the degree of dissimilarity: a song on the outer rim of the radar could be, say, 100% faster than the centroid. The concentric circles in the background serve as visual aids that help users judge the distance between two songs. By double-clicking one of the songs on the radar (or one song from the list on the right), the user makes that song the new centroid; the other songs are then relocated according to their similarity to it. Each of the similar songs further offers a quick-play option that lets the user listen to it without changing the current arrangement of the songs.

We experimented with different strategies for positioning the secondary songs. First we calculated the mean value of all extracted attributes and placed the songs accordingly. In some cases this leads to misleading and even wrong placements (see Figure 2 (a)). For example, a song that is more turbulent than the centroid could end up in the melodic sector of the radar because its slow and melodic attributes had high values as well. But it was our intention to create a design that shows all attribute dimensions at once and still allows the user to grasp the most significant type of similarity at first glance. One solution to this problem is to discard all values but the maximum (see Figure 2 (b)). The placement then becomes more coherent with the idea that a song is similar to the current one but, for example, only faster. However, this can lead to visual clutter because songs are placed only on the axes of the radar screen. To avoid this problem we use the second-highest value to compute an offset from the axis, so that songs are distributed within the sector of the dominant attribute (see Figure 2 (c)). Using the second-highest value as the offset additionally makes the offset itself meaningful to the user.
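In geometric terms, placement strategy (c) amounts to plotting each song's difference vector from the centroid in polar coordinates: the dominant component pins the song to one of the half-axes (and therefore to its sector), while the second-highest component tilts it into that sector. A minimal sketch of this reading, assuming the two dimensions currently shown on the radar (see Sect. 3.1), values in [0, 1], and the Song type from the sketch above:

```python
import math

def radar_position(song: Song, centroid: Song,
                   x_dim: int, y_dim: int, rim: float = 1.0) -> tuple[float, float]:
    """Strategy (c): the dominant per-axis difference selects the half-axis,
    the second-highest difference offsets the song within that sector."""
    dx = song.features[x_dim] - centroid.features[x_dim]  # e.g. slow(-) vs. fast(+)
    dy = song.features[y_dim] - centroid.features[y_dim]  # e.g. melodic(-) vs. rhythmic(+)
    r = min(math.hypot(dx, dy), 1.0) * rim   # overall dissimilarity -> distance from center
    theta = math.atan2(dy, dx)               # sector fixed by the larger absolute difference
    return r * math.cos(theta), r * math.sin(theta)
```

If |dx| >= |dy|, theta stays within 45 degrees of the horizontal half-axis, so a song can never leave the sector of its dominant attribute, which is exactly the guarantee strategy (c) aims for.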
3.1 Automatic Audio Analysis

To obtain the data for the placement of each song, we analyze the actual audio stream. The four extracted attributes describe each song's position in a four-dimensional feature space. The dimensions are slow vs. fast, clean vs. rough, calm vs. turbulent and melodic vs. rhythmic (see Figure 3). This four-dimensional space is projected onto the two-dimensional display by selecting two of the four dimensions and ignoring the other two (see Figure 4). Since the main focus of our work is on the visualization, we used a given analysis library [16] to derive these features. The current results are mostly plausible, but as better algorithms for analysis become available, these can be exchanged in a modular way.
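The paper thus treats feature extraction as a replaceable module behind the visualization. One way to express that boundary is sketched below with a hypothetical FeatureAnalyzer protocol, reusing the Song type from the first sketch; the axis names and ordering and the [0, 1] range are our assumptions, and the actual analysis library [16] is not shown.

```python
from typing import Protocol

# The four bipolar dimensions named above; the axis order is an assumption.
AXES = ("slow_fast", "clean_rough", "calm_turbulent", "melodic_rhythmic")

class FeatureAnalyzer(Protocol):
    """Pluggable analysis backend: maps an audio file to one value per axis."""
    def analyze(self, path: str) -> tuple[float, float, float, float]: ...

def build_library(paths: list[str], analyzer: FeatureAnalyzer) -> list[Song]:
    """Analyze every file once. Swapping in a better analyzer touches no other
    part of the player, since browsing and placement only read Song.features."""
    return [Song(title=path, features=analyzer.analyze(path)) for path in paths]
```

Projecting onto the display then simply means picking two indices into AXES, e.g. x_dim = AXES.index("slow_fast") and y_dim = AXES.index("melodic_rhythmic"), as in the placement sketch above.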

Similar Articles

Visualization in Comparative Music Research

Computational analysis of large musical corpora provides an approach that overcomes some of the limitations of manual analysis related to small sample sizes and subjectivity. The present paper aims to provide an overview of the computational approach to music research. It discusses the issues of music representation, musical feature extraction, digital music collections, and data mining techniq...

Handling Scanned Sheet Music and Audio Recordings in Digital Music Libraries

The last years have seen increasing efforts in building up large digital music collections. These collections typically contain various types of data ranging from audio data such as CD recordings to image data such as scanned sheet music, thus concerning both the auditorial and the visual modalities. In view of multimodal searching, navigation, and browsing applications across the various types...

SoniXplorer: Combining Visualization and Auralization for Content-Based Exploration of Music Collections

Music can be described best by music. However, current research in the design of user interfaces for the exploration of music collections has mainly focused on visualization aspects, ignoring possible benefits from spatialized music playback. We describe our first development steps towards two novel user-interface designs: The Sonic Radar arranges a fixed number of prototypes resulting from a co...

Content-based Organization of Digital Audio Collections

With increasing amounts of audio being stored and distributed electronically, intuitive and efficient access to large music collections is becoming crucial. To this end we are developing algorithms for audio feature extraction, allowing us to compute acoustic similarity between pieces of music, as well as tools utilizing this information to support retrieval of, as well as navigation in, music repos...

Strike-A-Tune: Fuzzy Music Navigation Using a Drum Interface

Project goals: "fuzzy" navigation; physically and visually intuitive interaction; large libraries. A traditional music library system controlled by a mouse and keyboard is precise, allowing users to select their desired song. Alternatively, randomized playlists or shuffles are used when users have no particular music in mind. We present a new interface and visualization system called Strike-A-Tune for fuz...


Publication date: 2006